Quantification of population structure using correlated SNPs by shrinkage principal components.
نویسندگان
چکیده
BACKGROUND/AIMS Association studies using unrelated individuals have become the most popular design for mapping complex traits. One of the major challenges of association mapping is avoiding spurious association due to population stratification. Principal component analysis (PCA) on genome-wide marker genotypes is one of the most popular population stratification control methods. It implicitly assumes that the markers are in linkage equilibrium, a condition that is rarely satisfied and that we plan to relax. METHODS We carefully examined the impact of linkage disequilibrium (LD) on PCA, and proposed a simple modification of the standard PCA to automatically adjust for the correlations among markers. RESULTS We demonstrated that LD patterns in genome-wide association datasets can distort the techniques for stratification control, showing 'subpopulations' reflecting localized LD phenomena rather than plausible population structure. We showed that the proposed method effectively removes the artifactual effect of LD patterns, and successfully recovers underlying population structure that is not apparent from standard PCA. CONCLUSION PCA is highly influenced by sets of SNPs with high LD, obscuring the true population substructure. Our shrinkage PCA applies to all available markers, regardless of the LD patterns. The proposed method is easier to implement than most existing LD adjusted PCA methods.
منابع مشابه
PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations
Existing methods to ascertain small sets of markers for the identification of human population structure require prior knowledge of individual ancestry. Based on Principal Components Analysis (PCA), and recent results in theoretical computer science, we present a novel algorithm that, applied on genomewide data, selects small subsets of SNPs (PCA-correlated SNPs) to reproduce the structure foun...
متن کاملAncestral Informative Marker Selection and Population Structure Visualization Using Sparse Laplacian Eigenfunctions
Identification of a small panel of population structure informative markers can reduce genotyping cost and is useful in various applications, such as ancestry inference in association mapping, forensics and evolutionary theory in population genetics. Traditional methods to ascertain ancestral informative markers usually require the prior knowledge of individual ancestry and have difficulty for ...
متن کاملبررسی ساختار جمعیتی گاوهای بومی ایران با استفاده از تحلیل افتراقی مؤلفههای اصلی
Effective management of genetic resources in the domestic animals is based on characterization of genetic structure and diversity among populations. Strategies reducing complexity and dimensions of data are required to analyze the genetic relationships between populations based on dense genomic data. The objective of this study was to use the discriminant analysis of principal components (DAPC)...
متن کاملTracing Sub-Structure in the European American Population with PCA-Informative Markers
Genetic structure in the European American population reflects waves of migration and recent gene flow among different populations. This complex structure can introduce bias in genetic association studies. Using Principal Components Analysis (PCA), we analyze the structure of two independent European American datasets (1,521 individuals-307,315 autosomal SNPs). Individual variation lies across ...
متن کاملAnalysis of genetic diversity, phylogenetic relationships and population structure of Arasbaran cornelian cherry (Cornus mas L.) genotypes using ISSR molecular markers
Cornelian cherry (Cornus mas L.), considered as the ancestor of cultivated trees in Arasbaran region, is a medicinally and economically plant species. However, little is known about genetic diversity, breeding programs, and population structure of this species in mentioned region. Keeping this in view, the main objectives of present study were to analysis the genetic diversity, phyloge...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Human heredity
دوره 70 1 شماره
صفحات -
تاریخ انتشار 2010